Improving protein-protein interaction prediction using evolutionary information from low-quality MSAs
Authors
Abstract
Evolutionary information stored in multiple sequence alignments (MSAs) has been used to identify the interaction interface of protein complexes, by measuring either co-conservation or co-mutation of amino acid residues across the interface. Recently, maximum entropy-related correlated mutation measures (CMMs) such as direct information, which decouple direct from indirect interactions, have been developed to identify residue pairs interacting across the protein complex interface. These studies have focussed on carefully selected protein complexes with large, good-quality MSAs. In this work, we study protein complexes with a more typical MSA consisting of fewer than 400 sequences, using a set of 79 intramolecular protein complexes. Using a maximum entropy-based CMM at the residue level, we develop an interface-level CMM score to be used in re-ranking docking decoys. We demonstrate that our interface-level CMM score compares favourably to the complementarity trace score, an evolutionary information-based score measuring co-conservation, when combined with the number of interface residues, a knowledge-based potential and the variability score of individual amino acid sites. We also demonstrate that, since co-mutation and co-complementarity in the MSA contain orthogonal information, the best prediction performance using evolutionary information can be achieved by combining the co-mutation information of the CMM with co-conservation information of a complementarity trace score, predicting a near-native structure as the top prediction for 41% of the dataset. The method presented is not restricted to small MSAs, and will likely also improve interface prediction for complexes with large and good-quality MSAs.
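To make the idea of an interface-level co-mutation score concrete, here is a minimal sketch under stated assumptions. It uses plain mutual information between alignment columns as a stand-in for the residue-level CMM; this is not the maximum entropy-based measure described in the abstract, and averaging over residue pairs is only one possible way to aggregate to the interface level. The function names (`column_mi`, `interface_score`), the toy alignments and the decoy contact lists are all hypothetical.

```python
from collections import Counter
from math import log

def column_mi(col_a, col_b):
    """Mutual information (in nats) between two aligned MSA columns.

    Assumption: plain MI is used here as a simple correlated mutation
    measure; it does not separate direct from indirect couplings.
    """
    n = len(col_a)
    pa, pb = Counter(col_a), Counter(col_b)
    pab = Counter(zip(col_a, col_b))
    return sum((c / n) * log(c * n / (pa[a] * pb[b]))
               for (a, b), c in pab.items())

def interface_score(msa_a, msa_b, contacts):
    """Mean MI over the residue pairs (i, j) of one candidate interface.

    Assumption: the mean is an illustrative aggregation, not the
    scoring function of the paper.
    """
    cols_a = list(zip(*msa_a))  # columns of the first partner's MSA
    cols_b = list(zip(*msa_b))  # columns of the second partner's MSA
    return sum(column_mi(cols_a[i], cols_b[j]) for i, j in contacts) / len(contacts)

# Toy paired alignment: row k of msa_a and row k of msa_b are assumed
# to come from the same organism.
msa_a = ["AK", "AK", "GR", "GR"]
msa_b = ["DE", "DE", "ED", "EE"]

# Hypothetical docking decoys, each described by its cross-interface contacts.
decoys = {"decoy_1": [(0, 0)], "decoy_2": [(1, 1)]}

# Re-rank decoys by decreasing interface-level score.
ranked = sorted(decoys,
                key=lambda d: interface_score(msa_a, msa_b, decoys[d]),
                reverse=True)
```

In this toy case the first columns of the two alignments co-vary perfectly, so the decoy that places them in contact is ranked first. With a small MSA such estimates are noisy, which is presumably why the abstract combines the co-mutation score with other terms instead of using it alone.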
Similar resources
Prediction of Protein Sub-Mitochondria Locations Using Protein Interaction Networks
Background: Prediction of protein localization is among the most important problems in bioinformatics and is used to predict the location of proteins in cells and organelles such as mitochondria. In this study, several machine learning algorithms are applied for the prediction of the intracellular protein locations. These algorithms use the features extracted from pro...
Discovering Domains Mediating Protein Interactions
Background: Protein-protein interactions do not provide any direct information regarding the domains within the proteins that mediate the interactions. The majority of proteins are multi-domain proteins and the interaction between them is often defined by the pairs of their domains. Most of the former studies focus only on interacting domain pairs. However, they do not consider the in...
Study of PKA binding sites in cAMP-signaling pathway using structural protein-protein interaction networks
Background: Protein-protein interaction plays a key role in signal transduction in signaling pathways. Different approaches are used for prediction of these interactions, including experimental and computational approaches. In conventional node-edge protein-protein interaction networks, we can only see which proteins interact, but ‘structural networks’ show us how these proteins inter...
Protein multiple sequence alignment benchmarking through secondary structure prediction
Motivation: Multiple sequence alignment (MSA) is commonly used to analyze sets of homologous protein or DNA sequences. This has led to the development of many methods and packages for MSA over the past 30 years. Being able to compare different methods has been problematic and has relied on gold standard benchmark datasets of 'true' alignments or on MSA simulations. A number of protein benchmark...
Evolutionary Analysis of Mammalian ACE2 and the Key Residues Involved in Binding to the Spike Protein Revealed Potential SARS-CoV-2 Hosts
Introduction: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spilled over to humans via wild mammals, entering the host cell using angiotensin-converting enzyme 2 (ACE2) as receptor through Spike (S) protein binding. While SARS-CoV-2 became fully adapted to humans and globally spread, some mammal species were infected back. The present study evaluated the potential risk of mammals...
Journal title:
Volume 12, issue
Pages -
Publication date: 2017